Modular Deep Reinforcement Learning for Continuous Motion Planning With Temporal Logic

Abstract

This paper investigates the motion planning of autonomous dynamical systems modeled by Markov decision processes (MDP) with unknown transition probabilities over continuous state and action spaces. Linear temporal logic (LTL) is used to specify high-level tasks over an infinite horizon, which can be converted into a limit deterministic generalized Büchi automaton (LDGBA) with several accepting sets. The novelty is to design an embedded product MDP (EP-MDP) between the LDGBA and the MDP by incorporating a synchronous tracking-frontier function to record unvisited accepting sets of the automaton, and to facilitate the satisfaction of the accepting conditions. The proposed LDGBA-based reward shaping and discounting schemes for model-free reinforcement learning (RL) only depend on the EP-MDP states and can overcome the issues of sparse rewards. Rigorous analysis shows that any RL method that optimizes the expected discounted return is guaranteed to find an optimal policy whose traces maximize the satisfaction probability. A modular deep deterministic policy gradient (DDPG) is then developed to generate such policies over continuous state and action spaces. The performance of our framework is evaluated via an array of OpenAI gym environments.
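The tracking-frontier idea in the abstract can be illustrated with a minimal Python sketch. This is not the paper's definition: the function name `update_frontier`, the index-based representation of accepting sets, and the rule that resets the frontier once every accepting set has been visited are all assumptions made for illustration. The sketch only shows how a set of "still unvisited" accepting sets might be updated as the automaton state changes.

```python
from typing import FrozenSet, List, Tuple


def update_frontier(
    q: str,
    frontier: FrozenSet[int],
    accepting_sets: List[FrozenSet[str]],
) -> Tuple[FrozenSet[int], bool]:
    """Hypothetical tracking-frontier update (illustrative only).

    q              -- current automaton state
    frontier       -- indices of accepting sets not yet visited in this round
    accepting_sets -- accepting sets of the automaton, as sets of states
    Returns the new frontier and whether q visited an unvisited accepting set.
    """
    # Accepting sets still in the frontier that contain the current state.
    hit = {i for i in frontier if q in accepting_sets[i]}
    if not hit:
        return frontier, False
    remaining = frontier - hit
    if not remaining:
        # Assumed reset rule: start a new round once every set has been visited.
        remaining = frozenset(range(len(accepting_sets)))
    return frozenset(remaining), True


# Two accepting sets; visiting q1, then q1 again, then q2.
sets = [frozenset({"q1"}), frozenset({"q2"})]
frontier = frozenset({0, 1})
frontier, hit_first = update_frontier("q1", frontier, sets)
frontier, hit_repeat = update_frontier("q1", frontier, sets)
frontier, hit_second = update_frontier("q2", frontier, sets)
```

In a setup like this, a shaped reward could be issued only when the returned flag is true, so that revisiting an already-visited accepting set earns nothing until the remaining sets have been reached. That reward rule is likewise an assumption rather than the scheme analyzed in the paper.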

Related papers

Continuous control with deep reinforcement learning

We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain. We present an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces. Using the same learning algorithm, network architecture and hyper-parameters, our algorithm robustly solves more than 20 simulated physics tasks, including classic...

Benchmarking Deep Reinforcement Learning for Continuous Control

Recently, researchers have made significant progress combining the advances in deep learning for learning feature representations with reinforcement learning. Some notable examples include training agents to play Atari games based on raw pixel data and to acquire advanced manipulation skills using raw sensory inputs. However, it has been difficult to quantify progress in the domain of continuou...

Temporal logic motion planning for dynamic robots

In this paper, we address the temporal logic motion planning problem for mobile robots that are modeled by second order dynamics. Temporal logic specifications can capture the usual control specifications such as reachability and invariance as well as more complex specifications like sequencing and obstacle avoidance. Our approach consists of three basic steps. First, we design a control law th...

Motion Planning for the ATHLETE Rover with Reinforcement Learning

Legged locomotion is attractive because it can enable a robot to traverse far more varied terrain than a wheeled rover is capable of. In the context of planetary exploration, this is especially attractive as the sites of greatest scientific interest tend to be characterized by difficult terrain. For example, it would be extremely difficult for a wheeled rover to make its way into a lunar crater...

Just-in-time synthesis for motion planning with temporal logic

The cost of the great expressivity of motion planning subject to temporal logic formulae is intractability. Recent advances in sampling-based methods seem to be only applicable to “low-level” control. The problem of realizing “high-level” controllers that satisfy a temporal logic specification does not readily admit approximations, unless the notion of correctness is relaxed as might be achieve...

Journal

Journal title: IEEE Robotics and Automation Letters

Year: 2021

ISSN: 2377-3766

DOI: https://doi.org/10.1109/lra.2021.3101544